Sec 2. Basic concepts
1)physical systems, degrees of freedom, underactuation, nonholonomic constraints, mode switching, piecewise continuity;
2)interactive perception
- purpose: estimate object properties; predict the effects of actions;
- can be used as: self-supervised learning
- how to do it: active learning (recurs in several places; can be used to learn both transition models and policies)
3)Hierarchical Task Decompositions and Skill Reusability
decompose tasks top-down, layer by layer [reduce the complex to the simple]; reuse skills
4)object-centric generalization
generalization via objects—both across different objects, and between similar (or identical) objects in different task instances
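Note: in practice this is fairly hard, because it requires generalizing across different objects. e.g. at the robot level, consider compliant (soft) grasping to adapt to a variety of objects, or at the object level, abstract a general-level representation.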
Sec 3. Formalization
Purpose: summarize the whole task family:
A task family is a distribution, P(M), over MDPs, each of which is a task.
Note: skill: higher-level action (option)
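The two definitions above can be made concrete with a small sketch. It assumes a toy 1-D reaching task, a uniform P(M) over goal positions, and the standard options formulation (initiation set, policy, termination condition); the names `MDP`, `sample_task`, `Option`, and `run_option` are hypothetical and only for illustration.

```python
import random
from dataclasses import dataclass
from typing import Callable


@dataclass
class MDP:
    """One task: a toy 1-D reaching problem whose goal differs per task."""
    goal: int
    n_states: int = 10

    def step(self, s: int, a: int) -> int:
        # Deterministic toy transition: move by a (-1 or +1), clipped to the state range.
        return max(0, min(self.n_states - 1, s + a))

    def reward(self, s_next: int) -> float:
        return 1.0 if s_next == self.goal else 0.0


def sample_task(rng: random.Random) -> MDP:
    """Draw M ~ P(M). Assumption: P(M) is uniform over goal positions."""
    return MDP(goal=rng.randrange(10))


@dataclass
class Option:
    """A skill as an option: where it may start, what it does, when it stops."""
    can_start: Callable[[int], bool]
    policy: Callable[[int], int]
    should_stop: Callable[[int], bool]


def run_option(task: MDP, option: Option, s: int, max_steps: int = 50) -> int:
    """Run the option's low-level policy until its termination condition fires."""
    if not option.can_start(s):
        raise ValueError("option cannot be initiated in this state")
    for _ in range(max_steps):
        if option.should_stop(s):
            break
        s = task.step(s, option.policy(s))
    return s


rng = random.Random(0)
task = sample_task(rng)
reach_goal = Option(
    can_start=lambda s: True,
    policy=lambda s: 1 if s < task.goal else -1,
    should_stop=lambda s: s == task.goal,
)
final_state = run_option(task, reach_goal, s=0)
```

From the planner's point of view, `run_option` behaves like a single higher-level action, even though it executes many low-level steps.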
Sec 4. Define and learn state and context spaces
1)object representation:
A.Overview: within-task or across-task (context)
B.Specific types: pose, shape, material, interaction or relative properties
C.HIERARCHIES: point; part; object level (low level -> high-level whole)
e.g.
- pixel level (contact points, segmentation);
- a mug can be seen as having an opening for pouring, a bowl for containing, a handle for grasping, and a bottom for placing;
- block stack: groups of objects;
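One possible way to hold the point / part / object / group hierarchy in code is sketched below. The class names, the `affordance` field, and the contact-point coordinates are illustrative assumptions, not part of the notes.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float, float]  # lowest level: a 3-D point, e.g. a contact point


@dataclass
class Part:
    """Part level: a named region of an object and what it is used for."""
    name: str
    affordance: str
    points: List[Point] = field(default_factory=list)


@dataclass
class ObjectModel:
    """Object level: a whole object made of parts."""
    name: str
    parts: List[Part]

    def part_for(self, affordance: str) -> Part:
        for part in self.parts:
            if part.affordance == affordance:
                return part
        raise KeyError(affordance)


@dataclass
class ObjectGroup:
    """Group level: several objects treated as one unit, e.g. a stack of blocks."""
    name: str
    members: List[ObjectModel]


mug = ObjectModel("mug", [
    Part("opening", "pouring"),
    Part("bowl", "containing"),
    Part("handle", "grasping", points=[(0.05, 0.0, 0.04)]),  # made-up contact point
    Part("bottom", "placing"),
])
stack = ObjectGroup("block stack", [ObjectModel(f"block_{i}", []) for i in range(3)])
```

A grasp planner could then ask for `mug.part_for("grasping")` without needing to know which specific mug it is looking at.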
2)method: passive and interactive perception
e.g. camera, human imitation vs. sensor-based interaction
3)steps: discover objects; determine degrees of freedom; estimate object properties
Note: active learning approaches are often used to select informative actions for quickly determining the model parameters.
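As a rough illustration of "selecting informative actions", the sketch below picks the push whose outcome is expected to reduce uncertainty about an object's mass the most (expected information gain). The three mass hypotheses, the binary "moved / did not move" observation, and all probabilities are made-up assumptions.

```python
import math
from typing import Dict, List


def entropy(p: List[float]) -> float:
    """Shannon entropy in bits."""
    return -sum(q * math.log2(q) for q in p if q > 0)


def posterior(prior: List[float], likelihood: List[float]) -> List[float]:
    """Bayes update, where likelihood[i] = P(observation | hypothesis i)."""
    unnorm = [p * l for p, l in zip(prior, likelihood)]
    z = sum(unnorm)
    return [u / z for u in unnorm]


def expected_info_gain(prior: List[float], p_move: List[float]) -> float:
    """Expected entropy reduction for one action; p_move[i] = P(moves | hypothesis i)."""
    p_obs_move = sum(p * m for p, m in zip(prior, p_move))
    outcomes = (
        (p_obs_move, p_move),                         # object moved
        (1.0 - p_obs_move, [1.0 - m for m in p_move]),  # object did not move
    )
    gain = entropy(prior)
    for p_obs, likelihood in outcomes:
        if p_obs > 0:
            gain -= p_obs * entropy(posterior(prior, likelihood))
    return gain


# Hypotheses about the object's mass: light, medium, heavy (uniform prior).
prior = [1 / 3, 1 / 3, 1 / 3]
# Assumed P(object moves | mass) for each candidate push.
actions: Dict[str, List[float]] = {
    "weak_push": [0.60, 0.10, 0.05],
    "medium_push": [0.95, 0.50, 0.05],
    "strong_push": [0.95, 0.95, 0.95],  # every mass reacts the same way
}
gains = {name: expected_info_gain(prior, p_move) for name, p_move in actions.items()}
best_action = max(gains, key=gains.get)
```

The strong push tells the robot nothing, because every hypothesis predicts the same outcome, so its expected gain is essentially zero and it is never selected.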
Sec 5. Transition model
1)General form
A deterministic function or a stochastic distribution
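The two general forms can be contrasted on a toy example: pushing a block along one axis. The slip probability and noise level are assumptions chosen only to make the difference visible.

```python
import random


def f_deterministic(s: float, a: float) -> float:
    """Deterministic form s' = f(s, a): the block moves exactly by the commanded push."""
    return s + a


def sample_stochastic(s: float, a: float, rng: random.Random,
                      p_slip: float = 0.2, noise_std: float = 0.01) -> float:
    """Stochastic form s' ~ P(s' | s, a): the push may slip, and motion is noisy."""
    if rng.random() < p_slip:
        return s  # the push slipped, so the block did not move
    return s + a + rng.gauss(0.0, noise_std)


rng = random.Random(0)
s_det = f_deterministic(0.0, 0.1)
samples = [sample_stochastic(0.0, 0.1, rng) for _ in range(1000)]
slip_rate = sum(1 for x in samples if x == 0.0) / len(samples)
```

Calling the deterministic model twice gives the same next state; the stochastic model has to be sampled, or queried for a distribution, to describe what may happen.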
2)Types: continuous; discrete; hybrid models
The discrete components of the state are often used to capture high-level task information while the continuous components capture low-level state information.
Key pt: continuous model
My view: e.g. action: 6-DoF (x, y, z, rx, ry, rz) -> continuous; state: object pose, likewise
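A hybrid state can be sketched as a discrete mode plus continuous 6-DoF poses. In the sketch below, the `Mode` values, the grasp radius, and the rule that a grasped object moves rigidly with the gripper are simplifying assumptions; adding rotation components directly is also a simplification that only holds for small angles.

```python
import math
from dataclasses import dataclass
from enum import Enum
from typing import Tuple

Pose6 = Tuple[float, float, float, float, float, float]  # (x, y, z, rx, ry, rz)


class Mode(Enum):
    FREE = "free"        # gripper and object move independently
    GRASPED = "grasped"  # object moves with the gripper


@dataclass(frozen=True)
class HybridState:
    mode: Mode           # discrete component: high-level task information
    gripper_pose: Pose6  # continuous component
    object_pose: Pose6   # continuous component


def step(state: HybridState, delta: Pose6, close_gripper: bool,
         grasp_radius: float = 0.02) -> HybridState:
    """Apply a continuous 6-DoF displacement; the discrete mode may switch."""
    new_gripper = tuple(g + d for g, d in zip(state.gripper_pose, delta))
    if state.mode is Mode.GRASPED and close_gripper:
        # Assumption: a grasped object follows the gripper rigidly.
        new_object = tuple(o + d for o, d in zip(state.object_pose, delta))
        return HybridState(Mode.GRASPED, new_gripper, new_object)
    dist = math.dist(new_gripper[:3], state.object_pose[:3])
    if close_gripper and dist <= grasp_radius:
        return HybridState(Mode.GRASPED, new_gripper, state.object_pose)
    return HybridState(Mode.FREE, new_gripper, state.object_pose)


s0 = HybridState(Mode.FREE, (0.0, 0.0, 0.1, 0.0, 0.0, 0.0), (0.0,) * 6)
s1 = step(s0, (0.0, 0.0, -0.1, 0.0, 0.0, 0.0), close_gripper=True)  # reach and grasp
s2 = step(s1, (0.0, 0.0, 0.2, 0.0, 0.0, 0.0), close_gripper=True)   # lift
```

The action and both poses stay continuous throughout, while the mode switch from FREE to GRASPED changes which continuous dynamics apply.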
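3)Stochasticity (e.g. an attempt to open a door does not always succeed) and uncertainty (can be resolved by gathering some additional data)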
4)how to learn: Self-supervision and Exploration
sample: act, then observe the effect to get (s, a, s')
• Random sampling.
• Active sampling approaches can be used to select action samples that are the most informative.
• Intrinsic motivation: the robot actively attempts to discover novel scenarios where its model currently performs poorly or that result in salient events.
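The sampling loop above can be sketched on a small discrete chain. The hidden environment, its 0.8 success probability, and the count-based "least tried" rule (used here as a very simple stand-in for active sampling) are all assumptions for illustration.

```python
import random
from collections import defaultdict

N_STATES = 5
ACTIONS = (-1, +1)


def true_step(s, a, rng):
    """Hidden environment: the action succeeds with probability 0.8, else no motion."""
    if rng.random() < 0.8:
        return max(0, min(N_STATES - 1, s + a))
    return s


def random_policy(s, counts, rng):
    """Random sampling: ignore what has been tried so far."""
    return rng.choice(ACTIONS)


def least_tried_policy(s, counts, rng):
    """Count-based stand-in for active sampling: pick the least-tried action in s."""
    return min(ACTIONS, key=lambda a: sum(counts[(s, a)].values()))


def collect(policy, n_steps, rng):
    """Act, observe the effect, and store each (s, a, s') as a count."""
    counts = defaultdict(lambda: defaultdict(int))  # counts[(s, a)][s_next]
    s = 0
    for _ in range(n_steps):
        a = policy(s, counts, rng)
        s_next = true_step(s, a, rng)
        counts[(s, a)][s_next] += 1
        s = s_next
    return counts


def estimate(counts, s, a):
    """Empirical transition distribution P(s' | s, a) from the collected samples."""
    total = sum(counts[(s, a)].values())
    return {s_next: c / total for s_next, c in counts[(s, a)].items()}


rng = random.Random(0)
counts = collect(least_tried_policy, 2000, rng)
model = estimate(counts, 2, +1)
```

Swapping `least_tried_policy` for `random_policy` gives the random-sampling baseline. An intrinsic-motivation variant would replace the count with a measure of model error or surprise.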